Distributed Word Representation Learning for Cross-Lingual Dependency Parsing
Authors
Abstract
This paper proposes to learn language-independent word representations to address cross-lingual dependency parsing, which aims to predict dependency parse trees for sentences in a target language by training a dependency parser on labeled sentences from a source language. We first combine all sentences from both languages to induce real-valued distributed representations of words under a deep neural network architecture, which is expected to capture semantic similarities of words not only within the same language but also across different languages. We then use the induced interlingual word representations as augmenting features to train a delexicalized dependency parser on labeled sentences in the source language and apply it to the target sentences. To investigate the effectiveness of the proposed technique, we conduct extensive experiments on cross-lingual dependency parsing tasks with nine different languages. The experimental results demonstrate the superior cross-lingual generalizability of the word representations induced by the proposed approach, compared to alternative methods.
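To make the two-stage pipeline concrete, here is a minimal sketch in Python. It assumes gensim's Word2Vec as a stand-in for the paper's deep neural network embedding learner; the corpus file names and the feature function are hypothetical placeholders, not the authors' implementation.

```python
# Sketch of the pipeline: (1) induce shared embeddings over the combined
# source+target corpus, (2) use them as augmenting features for a
# delexicalized parser. Word2Vec and the file names are assumptions.
from gensim.models import Word2Vec

def load_tokenized_sentences(path):
    """Read one whitespace-tokenized sentence per line."""
    with open(path, encoding="utf-8") as f:
        return [line.split() for line in f if line.strip()]

# 1) Combine unlabeled sentences from both languages and induce a single
#    embedding space over the joint corpus.
source_sents = load_tokenized_sentences("source_lang.tok")   # hypothetical path
target_sents = load_tokenized_sentences("target_lang.tok")   # hypothetical path
joint_corpus = source_sents + target_sents

embeddings = Word2Vec(
    sentences=joint_corpus,
    vector_size=100,   # dimensionality of the distributed representation
    window=5,
    min_count=2,
    sg=1,              # skip-gram
    epochs=10,
)

# 2) Represent each token by its POS tag plus its embedding, never by the
#    word form itself, so the trained parser transfers across languages.
def token_features(word, pos_tag):
    vec = (embeddings.wv[word] if word in embeddings.wv
           else [0.0] * embeddings.vector_size)
    return {"pos": pos_tag, "embedding": list(vec)}

print(token_features("house", "NOUN"))
```

The key point is that the parser itself stays delexicalized: word forms never enter the feature set directly, only their POS tags and interlingual vectors, so the model trained on the source language can be applied unchanged to target-language sentences.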
Similar Papers
Annotation Projection-based Representation Learning for Cross-lingual Dependency Parsing
Cross-lingual dependency parsing aims to train a dependency parser for an annotation-scarce target language by exploiting annotated training data from an annotation-rich source language, which is of great importance in the field of natural language processing. In this paper, we propose to address cross-lingual dependency parsing by inducing latent cross-lingual data representations via matrix co...
A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing
This paper investigates the problem of cross-lingual transfer parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g., English). Existing model transfer approaches typically don’t include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using dis...
Cross-lingual Dependency Parsing Based on Distributed Representations
This paper investigates the problem of cross-lingual dependency parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g. English). Existing approaches typically don’t include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using distributed featu...
Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages
In cross-lingual dependency annotation projection, information is often lost during transfer because of early decoding. We present an end-to-end graph-based neural network dependency parser that can be trained to reproduce matrices of edge scores, which can be directly projected across word alignments. We show that our approach to cross-lingual dependency parsing is not only simpler, but also a...
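As a rough illustration of the late-decoding idea described in this entry, the sketch below copies a source sentence's head-dependent edge score matrix onto a target sentence through word alignments. The function name, the alignment format, and the toy data are assumptions made for illustration, not the authors' model.

```python
import numpy as np

def project_edge_scores(source_scores, alignment, target_len):
    """Copy head-dependent edge scores through word alignments.

    source_scores[h, d] is the score of source head h governing source
    dependent d, with index 0 reserved for the artificial root.
    alignment maps source token positions (1-based) to target positions
    (1-based); unaligned tokens are simply absent from the mapping.
    Target edges that receive no score stay at -inf and can be ignored
    or smoothed by the downstream decoder.
    """
    projected = np.full((target_len + 1, target_len + 1), -np.inf)
    align = dict(alignment)
    align[0] = 0  # the artificial root aligns to itself
    for h_src, h_tgt in align.items():
        for d_src, d_tgt in align.items():
            if d_src == 0 or h_tgt == d_tgt:
                continue  # the root is never a dependent; skip self-loops
            projected[h_tgt, d_tgt] = source_scores[h_src, d_src]
    return projected

# Toy usage: project scores for a 3-token source sentence onto a
# 2-token target sentence with alignments 1->2 and 3->1.
src_scores = np.random.randn(4, 4)
projected = project_edge_scores(src_scores, {1: 2, 3: 1}, target_len=2)
print(projected.shape)  # (3, 3)
```

A graph-based decoder (for example, a maximum spanning tree algorithm) would then run directly on the projected matrix, which is what defers decoding until after projection instead of projecting already-decoded trees.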
Cross-Lingual Syntactically Informed Distributed Word Representations
We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space. The model, termed CLDEPEMB, is based on the following assumptions: (1) dependency relations are largely language-independent, at least for related languages and prominent dependency links such as direct objects, as evidenc...
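The dependency-based contexts mentioned in this entry can be illustrated with a short sketch. The (word, context) pair format below follows the common Levy-and-Goldberg-style convention for dependency contexts; the example sentence and relation labels are made up for illustration and are not taken from the CLDEPEMB paper.

```python
# Extract dependency-based (word, context) pairs from a parsed sentence;
# these pairs would then feed a skip-gram-style embedding learner.
def dependency_contexts(tokens, heads, rels):
    """Yield (word, context) pairs, where each context is the dependency
    relation plus the word at the other end of the arc.

    tokens: list of word forms
    heads:  heads[i] is the 0-based index of token i's head, or -1 for root
    rels:   rels[i] is the relation label of the arc into token i
    """
    for i, (word, head, rel) in enumerate(zip(tokens, heads, rels)):
        if head < 0:
            continue
        yield word, f"{rel}_{tokens[head]}"       # dependent sees its head
        yield tokens[head], f"{rel}-inv_{word}"   # head sees its dependent

# Toy example: "she reads books"
pairs = list(dependency_contexts(
    ["she", "reads", "books"],
    heads=[1, -1, 1],
    rels=["nsubj", "root", "obj"],
))
print(pairs)
# [('she', 'nsubj_reads'), ('reads', 'nsubj-inv_she'),
#  ('books', 'obj_reads'), ('reads', 'obj-inv_books')]
```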